Feature Subset Selection Algorithm for High Dimensional Data using Fast Clustering Method

Authors

  • A. GowriDurga
  • A. Gowri Priya
Abstract

Feature selection means finding the most useful features: a subset that produces results comparable to those of the entire feature set. A feature selection algorithm may be evaluated from both the efficiency and the effectiveness points of view. Efficiency concerns the time required to find a subset of features, while effectiveness concerns the quality of that subset. Based on these criteria, we propose a fast clustering-based feature selection algorithm (FAST). FAST works in two steps. First, the features are divided into clusters. Then the most useful feature is selected from each cluster. We adopt the minimum spanning tree (MST) to increase the efficiency of FAST. FAST is compared with several well-known feature selection algorithms: FCBF, Relief, CFS, Consist, and FOCUS-SF.
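The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how feature similarity is measured or how the tree is split into clusters, so the sketch assumes absolute Pearson correlation as the similarity measure and a fixed distance threshold for cutting MST edges. The names `abs_corr`, `select_features`, and `cut` are hypothetical.

```python
import math


def abs_corr(x, y):
    """Absolute Pearson correlation of two equal-length sequences (assumed measure)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return abs(cov / (sx * sy)) if sx and sy else 0.0


def select_features(columns, target, cut=0.5):
    """Return sorted indices of one representative feature per cluster.

    columns: list of feature columns (each a list of numbers)
    target:  list of target values, same length as each column
    cut:     assumed threshold; MST edges with distance above it are removed
    """
    m = len(columns)

    def dist(i, j):
        # Highly correlated (redundant) features are close to each other.
        return 1.0 - abs_corr(columns[i], columns[j])

    # Step 1a: build a minimum spanning tree over the features (Prim's algorithm).
    best = {j: (dist(0, j), 0) for j in range(1, m)}
    edges = []
    while best:
        j = min(best, key=lambda k: best[k][0])
        d, i = best.pop(j)
        edges.append((d, i, j))
        for k in best:
            dk = dist(j, k)
            if dk < best[k][0]:
                best[k] = (dk, j)

    # Step 1b: drop long edges; the remaining connected components are the clusters.
    parent = list(range(m))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for d, i, j in edges:
        if d <= cut:
            parent[find(i)] = find(j)

    clusters = {}
    for f in range(m):
        clusters.setdefault(find(f), []).append(f)

    # Step 2: from each cluster keep the feature most correlated with the target.
    return sorted(
        max(members, key=lambda f: abs_corr(columns[f], target))
        for members in clusters.values()
    )


if __name__ == "__main__":
    f0 = [1, 2, 3, 4, 5, 6]
    f1 = [2, 4, 6, 8, 10, 12.5]   # nearly a rescaled copy of f0 (redundant)
    f2 = [1, -1, 1, -1, 1, -1]    # weakly related to f0 and f1
    print(select_features([f0, f1, f2], target=f0))
```

In this toy example `f0` and `f1` end up in one cluster and `f2` in another, so one feature is kept from each. A different similarity measure or cutting rule would change the clusters; both are choices made for this sketch only.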


Similar resources

Online Streaming Feature Selection Using Geometric Series of the Adjacency Matrix of Features

Feature Selection (FS) is an important pre-processing step in machine learning and data mining. All the traditional feature selection methods assume that the entire feature space is available from the beginning. However, online streaming features (OSF) are an integral part of many real-world applications. In OSF, the number of training examples is fixed while the number of features grows with t...


Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach

Feature selection can significantly be decisive when analyzing high dimensional data, especially with a small number of samples. Feature extraction methods do not have decent performance in these conditions. With small sample sets and high dimensional data, exploring a large search space and learning from insufficient samples becomes extremely hard. As a result, neural networks and clustering a...


A New Hybrid Feature Subset Selection Algorithm for the Analysis of Ovarian Cancer Data Using Laser Mass Spectrum

Introduction: A major problem in the treatment of cancer is the lack of an appropriate method for the early diagnosis of the disease. The chemical reaction within an organ may be reflected in the form of proteomic patterns in the serum, sputum, or urine. Laser mass spectrometry is a valuable tool for extracting the proteomic patterns from biological samples. A major challenge in extracting such ...


Algorithm For Identifying Relevant Features Using Fast Clustering

In a high dimensional data set, feature selection involves identifying a subset of the most useful features that produces results compatible with those of the original entire set of features. A feature selection algorithm may be evaluated on both the time required to find a subset of features and the quality of that subset. Fast clustering based featur...


Fast Feature subset selection algorithm based on clustering for high dimensional data

A feature selection algorithm is employed to remove irrelevant and redundant information from the data. Among feature subset selection algorithms, filter methods are used because of their generality and are usually a good choice when the number of features is large. In cluster analysis, graph-theoretic clustering methods are applied to features. In particular, the minimum spanning tree (MST) based clustering ...


High Dimensional Data Clustering Using Fast Cluster Based Feature Selection

Feature selection involves identifying a subset of the most useful features that produces compatible results as the original entire set of features. A feature selection algorithm may be evaluated from both the efficiency and effectiveness points of view. While the efficiency concerns the time required to find a subset of features, the effectiveness is related to the quality of the subset of fea...




Publication date: 2014